An Introduction to rtweet:
Getting & Analyzing Data
from the Twitter API

SatRdaysDC

Matthew Hendrickson

2020/03/28

Topics

  1. About Me
  2. rtweet
  3. The Twitter API
  4. Tweeting from R
  5. search_tweets()
  6. get_trends()
  7. stream_tweets()
  8. Social Network Analysis - get_friends() & get_followers()
  9. get_timelines()
  10. get_favorites()

About Me

  • Social Scientist by Training
    • Psychology & Music %>%
    • More Psychology %>%
    • Law & Policy
  • Higher Education Analyst by Trade
  • R User by Stumbling
    • Excel %>%
    • SPSS GUI %>%
    • SPSS Syntax %>%
    • SQL %>%
    • R

rtweet


The Twitter API

  • No longer need a Twitter developer account
    • Walkthrough available for developer accounts
  • You DO need a Twitter account
  • Authorize API by calling an rtweet function like search_tweets()
  • Twitter API limits searches to 18,000 results per 15 minutes
    • Set retryonratelimit = TRUE to wait for the limit to reset and keep collecting
  • Must be used in accordance with Twitter’s developer terms
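
A minimal sketch of how the rate-limit note above might look in practice. The helper name search_past_limit() and its default n are hypothetical and not part of rtweet; only search_tweets() and its q, n, include_rts, and retryonratelimit arguments come from the package.

```r
# Hypothetical helper (not part of rtweet): search for a query and only
# switch on retryonratelimit when the request exceeds the 18,000-result window.
search_past_limit <- function(query, n = 50000) {
  rtweet::search_tweets(
    q = query,
    n = n,
    include_rts = FALSE,           # drop retweets
    retryonratelimit = n > 18000   # wait out each 15-minute window if needed
  )
}

# Usage (needs a Twitter account and a network connection; the first call
# prompts for authorization in the browser):
# rladies <- search_past_limit("#rladies", n = 30000)
```

Requests above the limit pause until the window resets, so a large pull can take a multiple of 15 minutes to finish.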

The Setup

Tweeting

… from R

https://memecrunch.com/meme/C0QML/no-way

search_tweets()

stream_tweets()

Social Network Analysis

Social Network Analysis Output

Timelines

get_timelines()

Favorites

get_favorites()

Upset Plot

But wait - there’s more!

For a full list of what’s included in the data pull, go to the Tweet Data Dictionary

A few of interest:

  • created_at = UTC time of Tweet
    • df$created_at - 18000 # 5 hours in seconds for Eastern Standard Time (use 14400 during daylight saving time)
  • text = content of the Tweet
  • source = utility used to post the Tweet
  • is_retweet = TRUE/FALSE if this was a re-tweet
  • favorite_count = number of times the Tweet was favorited
  • retweet_count = number of times the Tweet was re-tweeted
  • hashtags = list of all hashtags used
  • urls_url = list of all urls
  • mentions_screen_name = list of screen_names mentioned
  • Fields on re-tweets, geolocation, user info
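
A small sketch of working with a few of these fields, using base R only. The data frame below is invented mock data standing in for a search_tweets() result, and the column created_eastern is a hypothetical name. Passing a named time zone to format() is an alternative to subtracting a fixed number of seconds, since it accounts for daylight saving time.

```r
# Mock data standing in for a search_tweets() result; all values are invented.
tweets <- data.frame(
  created_at = as.POSIXct(
    c("2020-03-28 16:00:00", "2020-01-15 16:00:00"),
    tz = "UTC"
  ),
  text = c("Hello #rstats", "RT hello"),
  is_retweet = c(FALSE, TRUE),
  favorite_count = c(12L, 0L),
  stringsAsFactors = FALSE
)

# Render created_at in Eastern Time; the time zone database decides
# whether each timestamp falls in standard or daylight time.
tweets$created_eastern <- format(
  tweets$created_at,
  format = "%Y-%m-%d %H:%M",
  tz = "America/New_York",
  usetz = TRUE
)

# Keep original Tweets only (drop re-tweets)
originals <- tweets[!tweets$is_retweet, ]
```

The same 16:00 UTC timestamp lands at 12:00 in March and 11:00 in January, which is the case a fixed 18000-second offset misses.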

All available fields

print(colnames(search_tweets("#rladies", n = 1, include_rts = FALSE)))
#>  [1] "user_id"                 "status_id"              
#>  [3] "created_at"              "screen_name"            
#>  [5] "text"                    "source"                 
#>  [7] "display_text_width"      "reply_to_status_id"     
#>  [9] "reply_to_user_id"        "reply_to_screen_name"   
#> [11] "is_quote"                "is_retweet"             
#> [13] "favorite_count"          "retweet_count"          
#> [15] "quote_count"             "reply_count"            
#> [17] "hashtags"                "symbols"                
#> [19] "urls_url"                "urls_t.co"              
#> [21] "urls_expanded_url"       "media_url"              
#> [23] "media_t.co"              "media_expanded_url"     
#> [25] "media_type"              "ext_media_url"          
#> [27] "ext_media_t.co"          "ext_media_expanded_url" 
#> [29] "ext_media_type"          "mentions_user_id"       
#> [31] "mentions_screen_name"    "lang"                   
#> [33] "quoted_status_id"        "quoted_text"            
#> [35] "quoted_created_at"       "quoted_source"          
#> [37] "quoted_favorite_count"   "quoted_retweet_count"   
#> [39] "quoted_user_id"          "quoted_screen_name"     
#> [41] "quoted_name"             "quoted_followers_count" 
#> [43] "quoted_friends_count"    "quoted_statuses_count"  
#> [45] "quoted_location"         "quoted_description"     
#> [47] "quoted_verified"         "retweet_status_id"      
#> [49] "retweet_text"            "retweet_created_at"     
#> [51] "retweet_source"          "retweet_favorite_count" 
#> [53] "retweet_retweet_count"   "retweet_user_id"        
#> [55] "retweet_screen_name"     "retweet_name"           
#> [57] "retweet_followers_count" "retweet_friends_count"  
#> [59] "retweet_statuses_count"  "retweet_location"       
#> [61] "retweet_description"     "retweet_verified"       
#> [63] "place_url"               "place_name"             
#> [65] "place_full_name"         "place_type"             
#> [67] "country"                 "country_code"           
#> [69] "geo_coords"              "coords_coords"          
#> [71] "bbox_coords"             "status_url"             
#> [73] "name"                    "location"               
#> [75] "description"             "url"                    
#> [77] "protected"               "followers_count"        
#> [79] "friends_count"           "listed_count"           
#> [81] "statuses_count"          "favourites_count"       
#> [83] "account_created_at"      "verified"               
#> [85] "profile_url"             "profile_expanded_url"   
#> [87] "account_lang"            "profile_banner_url"     
#> [89] "profile_background_url"  "profile_image_url"

Thank you


@mjhendrickson


matthewjhendrickson


mjhendrickson

rtweet repo

This talk is freely distributed under the MIT License.
(So is rtweet!)

References